Evolutionary Prisoner's Dilemma Game in Flocks 



Zhuo Chen 1,*, Jianxi Gao 1, Yunze Cai 2 and Xiaoming Xu 1,2,3
1 Shanghai Jiao Tong University, Shanghai, China 
2 University of Shanghai For Science and Technology, Shanghai, China 
3 Shanghai Academy of Systems Science, Shanghai, China 
* jeffchen_ch@yahoo.com.cn

May 19, 2010 



Abstract 

We investigate an evolutionary prisoner's dilemma game among self-driven
agents, where the collective motion of biological flocks is imitated by
averaging the directions of neighbors. Depending on the temptation to defect
and the velocity at which agents move, we find that cooperation can not only
be maintained in such a system, but that there also exists an optimal size of
the interaction neighborhood, which induces the maximum cooperation level.
Compared with the case in which no agent moves, cooperation can even be
enhanced by the mobility of individuals, provided that the velocity and the
size of the neighborhood are not too large. In addition, we find that the
system exhibits aggregation behavior, and that cooperators may coexist with
defectors at equilibrium.

1 Introduction

Cooperation is commonly observed in genomes, cells, multi-cellular organisms, 
social insects, and human society, but Darwin's theory of evolution implies 
fierce competition for existence among selfish and unrelated individuals. In past 
decades, much effort has been devoted to understanding the mechanisms behind 
the emergence and maintenance of cooperation. In this context, the prisoner's 
dilemma game is a widely used model to illustrate the conflict between selfish 
and cooperative behavior. 

The traditional prisoner's dilemma (PD) game is a two-player game in which
each player chooses either cooperation (C) or defection (D). Mutual
cooperation pays each player a reward R, while mutual defection brings each a
punishment P. If one player cooperates while the other defects, the
cooperator obtains the sucker's payoff S and the defector gains the
temptation T. The four payoff values satisfy T > R > P > S and 2R > S + T.
According to these inequalities, defection is the optimal strategy for a
selfish player in a one-shot game, no matter what the opponent does. But the
total income of two defectors is lower than that of two cooperators. Hence
the dilemma arises, and defection is evolutionarily stable in a well-mixed
population [1].
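
As a minimal illustration of the inequalities above, the following Python
sketch checks the PD conditions and the dominance of defection in a one-shot
game. The function names and the numerical payoffs (T, R, P, S) = (5, 3, 1, 0)
are illustrative assumptions and are not taken from this paper.

```python
def is_prisoners_dilemma(T, R, P, S):
    """Return True if the payoffs satisfy T > R > P > S and 2R > S + T."""
    return T > R > P > S and 2 * R > S + T


def one_shot_payoff(my_move, other_move, T, R, P, S):
    """Payoff to the player choosing my_move ('C' or 'D') against other_move."""
    table = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    return table[(my_move, other_move)]


# Illustrative values only (an assumption, not the paper's parameters).
T, R, P, S = 5.0, 3.0, 1.0, 0.0
assert is_prisoners_dilemma(T, R, P, S)

# Defection pays more whatever the opponent does ...
assert one_shot_payoff("D", "C", T, R, P, S) > one_shot_payoff("C", "C", T, R, P, S)
assert one_shot_payoff("D", "D", T, R, P, S) > one_shot_payoff("C", "D", T, R, P, S)

# ... yet two cooperators jointly earn more than two defectors.
assert 2 * R > 2 * P
```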

The spatial PD games have attracted much attention since Nowak and May
reported the stable coexistence of cooperators and defectors on a
two-dimensional lattice [2]. Since then, much work has been done to add
randomness to the deterministic game dynamics. For example, noise can be
introduced based on the payoff difference, which allows an inferior strategy
to be followed with a certain probability [3]. The mapping of game payoffs to
individual fitness can follow different distributions, which accounts for
social diversity [4]. In the dynamic preferential selection model, the more
frequently a neighbor's strategy is adopted by the focal player, the more
likely that neighbor is to be chosen as a reference in subsequent rounds [5].
Besides, the networks describing connections among individuals have also been
extended from lattices to complex networks [6] [7] [8] [9] [10]. For more
details about spatial evolutionary games, please see Refs. [11] [12] [13] and
references therein.

In spatial games mentioned above, players are located on the vertices of the 
network, and edges among vertices determine who plays with whom. Often the 
network is assumed to be static. However, in real social systems, the network size 
may continuously change as individuals join or quit, and the network structure 
can also evolve as links are created or broken. It has been reported that
co-evolution constitutes a key mechanism for the sustainability of cooperation
in dynamic networks [14] [15] [16] [17] [18] [19].

For a network, the movement of individuals may change either its size or its
structure. For example, when people drive, their cell phones connect to
different base stations in the mobile communication network, and moving house
brings one new neighbors in the acquaintance network. In fact, the motion of
individuals is an important characteristic of the social network [20], and
the patterns of human mobility have drawn much attention in the past years
[21] [22]. Once spatial structure has been introduced, it is natural to
consider the evolution of cooperation among mobile individuals.

Intuitively, the introduction of mobility would lead to the dominance of
defection, because mobile defectors can expect to exploit more cooperators
than in a static network and can escape the retaliation of former partners by
running away. Yet the correlation between cooperation and mobility is more
complex than intuition suggests. Mobility could affect the origin of altruism,
while a rise in the cost of altruism would lead to an evolutionary reduction
of mobility [23]. With a win-stay, lose-shift rule, cooperation would be
evolutionarily stable under generalized reciprocity [24]. Further, in
agent-based models, the mobility of individuals can be included explicitly as
the movement of agents. "Walk Away", a simple strategy of contingent
movement, can outperform complex strategies under a number of conditions
[25]. Success-driven migration may promote the spontaneous outbreak of
cooperation in a noisy world dominated by selfishness and defection [26].
Even under a blind pattern of mobility, cooperation is not only possible but
may also be enhanced for a broad range of parameters, compared with the case
in which agents never move [27] [28] [29].

In the present work we study the evolution of cooperation among mobile 
players, which are allowed to move in a two-dimensional plane without periodic 
boundary conditions. The movement of every agent is non-contingent, imitating 
the direction alignment process in biological flocks. We find that there exists 
an optimal size of interaction neighborhood, which can induce the maximum 
cooperation level. Compared with the case in which no agent moves,
cooperation can not only be maintained but even be enhanced by the movement
of players. We also investigate the dependence of the cooperator frequency on
the density of agents, and the coexistence of different strategies is
illustrated.



2 The Model 

Let x_i(t) and θ_i(t), i = 1, 2, ..., N, denote the position and moving
direction of agent i at time t, t = 0, 1, 2, ..., respectively. Assume that
each agent has the same absolute velocity v. At t = 0, all agents are randomly
distributed in an L × L square without boundary restrictions, and their
directions θ_i(0) are uniformly distributed in [0, 2π). The position of each
agent is updated according to

$$ x_i(t+1) = x_i(t) + v_i(t) \Delta t, \qquad (1) $$

where v_i(t) is characterized by v and θ_i(t). In addition, Δt is set to 1
between two updates of the positions.
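
The initialization and the position update of Eq. (1) can be sketched in
Python as follows. This is only an illustrative sketch: the function names are
hypothetical, and the velocity vector is assumed to be
v_i(t) = v (cos θ_i(t), sin θ_i(t)), which is one natural reading of
"characterized by v and θ_i(t)".

```python
import math
import random


def init_agents(n, box, rng):
    """Place n agents uniformly in a box x box square with random headings."""
    positions = [(rng.uniform(0.0, box), rng.uniform(0.0, box)) for _ in range(n)]
    angles = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n)]
    return positions, angles


def step_positions(positions, angles, v, dt=1.0):
    """Apply x_i(t+1) = x_i(t) + v_i(t) * dt; no boundary wrapping is applied."""
    return [
        (x + v * math.cos(theta) * dt, y + v * math.sin(theta) * dt)
        for (x, y), theta in zip(positions, angles)
    ]


rng = random.Random(0)  # fixed seed, used only to make the sketch reproducible
positions, angles = init_agents(500, 7.0, rng)  # N = 500, L = 7 as in the text
positions = step_positions(positions, angles, v=0.05)
```

Because no periodic boundary is imposed, agents are free to leave the initial
square, which is why the update contains no modulo operation.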

In biological systems, such as flocks of birds and schools of fish, individuals 
tend to align their moving directions with that of nearby neighbors. To simulate 
the process of direction alignment in flocks, the angle θ_i(t) of agent i is
updated according to the average direction of its neighbors:

$$ \theta_i(t+1) = \arctan \frac{\sin\theta_i(t) + \sum_{j \in W_i(t)} \sin\theta_j(t)}{\cos\theta_i(t) + \sum_{j \in W_i(t)} \cos\theta_j(t)}, \qquad (2) $$

where W_i(t) denotes the neighbor set of agent i at time t.
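
A possible implementation of the alignment rule of Eq. (2) is sketched below.
Note one deliberate substitution: the sketch uses the two-argument function
atan2 instead of a plain arctangent of the ratio, so that the quadrant of the
averaged direction is resolved and a zero denominator is handled. The function
name and the mapping of the result to [0, 2π) are assumptions made for this
example.

```python
import math


def align_direction(theta_i, neighbor_thetas):
    """Return the new heading of agent i from its own and its neighbors' headings."""
    sin_sum = math.sin(theta_i) + sum(math.sin(t) for t in neighbor_thetas)
    cos_sum = math.cos(theta_i) + sum(math.cos(t) for t in neighbor_thetas)
    # atan2 keeps track of the quadrant; the modulo maps the angle to [0, 2*pi).
    return math.atan2(sin_sum, cos_sum) % (2.0 * math.pi)


# An agent heading along +x with one neighbor heading along +y turns to 45 degrees.
new_theta = align_direction(0.0, [math.pi / 2])
assert abs(new_theta - math.pi / 4) < 1e-9
```

With a plain arctangent of the ratio, a group heading in the direction π would
be mapped to a heading near 0, which is why the quadrant-aware form is used
here.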

In real populations, people are believed to interact much more with their
neighbors than with those who are far away [3]. For this reason, when players
are located on the nodes of a fixed network, interactions usually take place
between directly connected players. When players keep moving, distances can
be used to find the neighbors close to the focal one [30]. Note that in the
Vicsek model [31], the neighbor set W_i(t) is defined as the agents within
the circle of radius r centered at agent i. To exclude the effects of
fluctuations in the neighborhood size, we assume that each agent interacts
only with its k nearest neighbors at time t. Thus W_i(t) can be written as

$$ W_i(t) = \operatorname{argmin}_k \{ \| x_i(t) - x_j(t) \|,\ j \in N,\ j \neq i \}, \qquad (3) $$

where the function argmin_k{·} finds the k smallest elements of {·}, and ||·||
denotes the Euclidean distance between j and i in the two-dimensional space.
In simulations, the distances between the focal agent and the others are
calculated first. They are then sorted in ascending order, x_1 ≤ x_2 ≤ x_3 ≤
... ≤ x_{N-1}, where x denotes the distance between i and j and the subscript
represents its rank. If x_1 ≠ x_2 ≠ x_3 ≠ ... ≠ x_k ≠ x_{k+1}, the k nearest
agents are chosen as neighbors. If x_m = x_{m+1} = ... = x_{m+n} with 1 ≤ m ≤
k and (k − m) < n, then k − m + 1 agents are randomly selected among these
n + 1 agents. The sorting process leads to a directed interaction network,
however: i ∈ W_j(t) does not imply j ∈ W_i(t), where W_i(t) and W_j(t) denote
the neighbor sets of i and j at time t, respectively.
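
The neighbor selection described above can be sketched as follows. The sketch
is a brute-force version with hypothetical names; random tie-breaking is
obtained by shuffling the candidates before a stable sort, which is one simple
way (an assumption of this example, not necessarily the authors'
implementation) to pick uniformly among agents at exactly equal distance.

```python
import math
import random


def k_nearest_neighbors(i, positions, k, rng):
    """Return the set of the k agents closest to agent i (ties broken at random)."""
    xi, yi = positions[i]
    candidates = [
        (math.hypot(xj - xi, yj - yi), j)
        for j, (xj, yj) in enumerate(positions)
        if j != i
    ]
    # Shuffle first: Python's sort is stable, so equal distances stay in random order.
    rng.shuffle(candidates)
    candidates.sort(key=lambda item: item[0])
    return {j for _, j in candidates[:k]}


rng = random.Random(1)
line = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
w = [k_nearest_neighbors(i, line, 1, rng) for i in range(3)]
# The network is directed: agent 1 is a neighbor of agent 2, but not vice versa.
assert 1 in w[2] and 2 not in w[1]
```

The example with three agents on a line shows the asymmetry noted in the text:
agent 2 selects agent 1 as its nearest neighbor, while agent 1 selects agent 0.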

Next we introduce the evolutionary rules of our game. Initially, each player
is randomly assigned one of the two PD strategies with equal probability. The
strategy s_i of each player is denoted by a unit vector (1, 0)^T or (0, 1)^T,
which indicates cooperation or defection, respectively. At each time step,
each player plays the PD game with its neighbors M_i(t), accumulating a payoff
P_i = Σ_{j ∈ M_i(t)} s_i^T A s_j. Here we assume that M_i(t) = W_i(t).
Following common practice [5], the payoff matrix of the PD takes the rescaled
form

$$ A = \begin{pmatrix} 1 & 0 \\ b & 0 \end{pmatrix}, \qquad (4) $$

where 1 < b < 2. Then, at the next time step, every player adopts the strategy
that gained the highest payoff among itself and its neighbors [2]. Although
the evolution of strategies and the movement of agents are in principle
characterized by two separate time scales, they are treated as the same here:
every agent updates its position and direction after each strategy update.
This process is repeated until the system reaches equilibrium.
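
The payoff accumulation and the best-takes-over update can be sketched as
follows. Several points are assumptions of this example rather than statements
of the paper: the rescaled payoffs are taken to be R = 1, T = b and S = P = 0,
all players update synchronously from the payoffs of the same round, and a
player keeps its own strategy when its payoff ties with the best neighbor. The
function names are hypothetical.

```python
def accumulate_payoffs(strategies, neighbors, b):
    """Payoff of each player against its own neighbor set (R=1, T=b, S=P=0)."""
    table = {("C", "C"): 1.0, ("C", "D"): 0.0, ("D", "C"): b, ("D", "D"): 0.0}
    return [
        sum(table[(strategies[i], strategies[j])] for j in neighbors[i])
        for i in range(len(strategies))
    ]


def best_takes_over(strategies, neighbors, payoffs):
    """Each player copies the highest-payoff strategy among itself and its neighbors."""
    updated = []
    for i in range(len(strategies)):
        # The focal player is listed first, so max() keeps it on ties (assumption).
        candidates = [i] + sorted(neighbors[i])
        best = max(candidates, key=lambda j: payoffs[j])
        updated.append(strategies[best])
    return updated


strategies = ["C", "C", "D"]
neighbors = [{1}, {0, 2}, {1}]
payoffs = accumulate_payoffs(strategies, neighbors, b=1.2)
strategies = best_takes_over(strategies, neighbors, payoffs)
```

In this three-player example the defector earns b = 1.2 from its cooperating
neighbor, which exceeds the payoff 1 of the middle cooperator, so the middle
player switches to defection in the next round.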

In our model, the distances between agents determine the network of contacts,
and the agents continuously change their positions. As a result, the
neighbors may be different at each step, though the size of the neighborhood
is fixed. To characterize the evolution of the interaction network, we
calculate the number of new neighbors that all agents meet at time t as

$$ n(t) = \sum_{i=1}^{N} \left| W_i(t) - W_i(t) \cap W_i(t-1) \right|, \qquad (5) $$

where |·| represents the set size.
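
With neighbor sets stored as Python sets, the count of new neighbors in Eq. (5)
reduces to a few set operations. The function name below is hypothetical and
the input lists are made-up examples.

```python
def count_new_neighbors(current, previous):
    """Sum over agents of |W_i(t) - (W_i(t) & W_i(t-1))|."""
    return sum(
        len(w_now - (w_now & w_prev)) for w_now, w_prev in zip(current, previous)
    )


previous = [{1, 2}, {0, 3}, {3, 4}]
current = [{1, 2}, {0, 2}, {0, 1}]
# Agent 0 meets nobody new, agent 1 meets agent 2, agent 2 meets agents 0 and 1.
n_t = count_new_neighbors(current, previous)
assert n_t == 3
```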

Fig. 1 shows typical evolutions of n(t), which is divided by N for
normalization, and of the frequency of cooperators f_c. One can see that n(t)
decreases to zero when t > 200. For comparison, we also plot the evolution of
the average normalized velocity V_a defined in Ref. [21]. As n(t) decreases,
V_a also reaches a steady value, which indicates a stable distribution of the
moving directions of the agents. These findings imply that, given a
sufficient relaxation time, each agent acquires a fixed neighborhood. Later,
we will show that without periodic boundary conditions, the system forms many
disconnected components after a long run time. Thus the variation of
neighbors, if any, would be confined to a fraction of the agents in the
population.
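
For readers who want to reproduce the V_a curve, the sketch below assumes that
V_a is the usual order parameter of self-propelled particle models, namely the
norm of the summed velocity vectors divided by N v. This definition is an
assumption of the example; the paper itself only refers to the cited work for
it. Since all agents share the same speed v, the quantity depends on the
headings alone.

```python
import math


def average_normalized_velocity(angles):
    """Norm of the mean unit heading vector: 1 for full alignment, about 0 for disorder."""
    n = len(angles)
    sum_x = sum(math.cos(theta) for theta in angles)
    sum_y = sum(math.sin(theta) for theta in angles)
    return math.hypot(sum_x, sum_y) / n


aligned = average_normalized_velocity([0.3] * 10)
opposed = average_normalized_velocity([0.0, math.pi])
assert abs(aligned - 1.0) < 1e-12
assert opposed < 1e-12
```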

Fig. 1. Representative time evolutions of f_c, n and V_a for b = 1.2, k = 15
and v = 0.05.

The simulations were carried out in a system with N = 500 and L = 7. In
each realization, we first check whether the interaction network is fixed
after a suitable relaxation time. The relaxation time is varied from 5000 to
10^5 generations, with longer run times corresponding to a small neighborhood
size k or velocity v. If n(t) < 1 holds for q = 1000 consecutive time steps,
the network is treated as static. Then we evaluate the frequency of
cooperators at equilibrium by averaging over the last 1000 generations. All
data points shown in each figure are obtained by averaging over 400
realizations with independent initial states.
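
The stationarity test and the final averaging described above can be sketched
as follows. The helper names are hypothetical, and the histories are assumed to
be plain Python lists with one entry per time step.

```python
def is_network_static(n_history, q=1000):
    """True if n(t) < 1 held for each of the last q recorded time steps (q >= 1)."""
    if len(n_history) < q:
        return False
    return all(n < 1 for n in n_history[-q:])


def cooperator_frequency(strategies):
    """Fraction of players currently using strategy 'C'."""
    return sum(1 for s in strategies if s == "C") / len(strategies)


def stationary_average(fc_history, window=1000):
    """Mean of f_c over the last `window` generations."""
    tail = fc_history[-window:]
    return sum(tail) / len(tail)


# Small illustrative histories (made-up numbers, not simulation output).
assert is_network_static([3, 0, 0, 0], q=3)
assert not is_network_static([0, 0, 2, 0], q=3)
assert cooperator_frequency(["C", "D", "C", "C"]) == 0.75
```

In a full simulation these helpers would be called once per realization, and
the resulting stationary averages would then be averaged again over the
independent realizations.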



3 Results and Discussions 

Fig. 2 illustrates the dependence of the frequency of cooperators f_c on the
temptation b in the stationary state for different neighborhood sizes k at a
fixed absolute velocity v. For fixed v and k, f_c shows a step structure and
gradually decreases as b increases. However, the neighborhood size k can
strongly affect the evolution of cooperation. One can see that cooperators
are more prone to die out when k is large. But before the system is
completely occupied by defectors, there exists an appropriate k that promotes
cooperation for a fixed b. As shown in Fig. 2(a), the cooperation level for
k = 9 is always higher than that for the other values of k if b < 1.35. If
1.35 < b < 1.51, the highest level of cooperation is achieved when k = 3. If
b > 1.51, defectors dominate the population regardless of the value of k.
These findings suggest a non-monotonic dependence of the cooperator frequency
on the neighborhood size k. Besides, the absolute velocity v also plays an
important role in the evolution of cooperation. Comparing Fig. 2(b) with
Fig. 2(a), one can see that the increase of v leads to an apparent drop of
f_c for k = 9 or k = 3. But for k = 15 or k = 21, the increase of v does not
change the cooperation level much. When v = 0.35, the highest level of
cooperation for k = 21 is still above 0.6. These findings imply that the
influence of the movement of individuals on f_c varies with k.

Fig. 2. The frequency of cooperators f_c versus the temptation to defect b
for (a) v = 0.05, (b) v = 0.1, (c) v = 0.15 and (d) v = 0.35. The curves for
k = 21, 15, 9 and 3 correspond to different neighborhood sizes, and b ranges
from 1.01 to 1.53 in steps of 0.02.

To investigate the role of the neighborhood size k, Fig. 3 presents the
cooperator frequency f_c as a function of k for a fixed temptation b. Dai et
al. reported the promotion of cooperation through enlarging the size of the
neighborhood among mobile agents [35], where molecular dynamics is used to
describe repulsion and attraction between agents in flocks. But in our model,
a resonance-like behavior can be observed: there exists a peak of f_c at some
values of k. In fact, the same behavior has been found in three typical
networks, where the density of cooperators peaks at some specific values of
the average degree [32]. Our work can thus be viewed as an extension of
previous work to dynamical networks. Next we give a simple explanation for
the non-trivial relation between f_c and k. On square lattices and regular
ring-graphs, the fixed locations of players provide continuous interactions
within local neighborhoods, and cooperators can cluster together to resist
the invasion of defectors [3] [6]. The increase of the average degree indeed
hampers cooperation, because the well-mixed limit is approached for a
sufficiently large neighborhood [33]. When players keep moving, however,
clusters of cooperators may be destroyed by time-varying neighborhoods. The
smaller k is, the longer the system needs to form a fixed interaction
network. Increasing k enhances the probability of future encounters between
players and their former neighbors. As a result, interactions among
cooperators can be maintained. But defectors can also exploit more
cooperators as k increases, and large values of k reproduce the mean-field
situation. To promote cooperation, there should be a compromise between the
two limits of k discussed above. That is why the cooperation level reaches
its maximum only at intermediate values of k. At the same time, the positive
effect of intermediate local connections on cooperation is strongly
constrained by the absolute velocity v and the temptation b. In Fig. 3, the
value of f_c at the peak decreases as v increases. And when b increases to
1.2 in Fig. 3(b), the curve of f_c is almost flat for v = 0.2.




Fig. 3. The frequency of cooperators f_c as a function of the neighborhood
size k for b = 1.05 and b = 1.2, respectively. k ranges from 3 to 26 in steps
of 1.

To study the effect of the absolute velocity v on the cooperation level,
Fig. 4(a) shows the frequency of cooperators f_c as a function of the
temptation b for different values of v with k = 9. In our model, the velocity
v measures the movement speed of players. One can see that for v > 0.1, the
cooperator frequency is lower than that for v = 0 and decreases gradually as
v increases. In fact, when the agents move with a high velocity, they have a
greater chance of meeting different neighbors than in the case of small v.
Before the interaction network becomes fixed, the neighbors of each agent
change quite often, or might even be completely different at each time step.
As a result, there is only a small probability of forming compact clusters of
cooperators, which leads to the dominance of defectors. For v ≤ 0.01,
however, the situation is reversed. Compared with the case in which agents do
not move, the cooperation level is promoted throughout the whole parameter
range of b when v = 0.005 or v = 0.01. This suggests that cooperation among
mobile individuals is not only possible but may even be enhanced, in
accordance with previous work [27, 28, 29]. But this effect relies on the
neighborhood size k, as shown in Fig. 4(b), which presents the dependence of
f_c on v for different values of k with b = 1.17. For k ≤ 9, the cooperation
level increases with v and reaches its maximum around v = 0.01. When v = 0.1,
a drop of f_c appears. For k = 15 or k = 21, f_c changes little when v
increases. This can be explained by the occurrence of the mean-field
situation at large values of k, which offsets the enhancement of cooperation
from mobility.







Fig. 4. (a) The frequency of cooperators fc as a function of the temptation b 
for k = 9 with various velocities v. b ranges from 1.01 to 1.41 with an interval 
of 0.02. (b) The frequency of cooperators fc versus the absolute velocity v for 
b = 1.17 with different sizes k of neighborhood. A logarithmic scale is used for 
the X axis. 



The evolution of the system depends on the density p of agents at t = 0,
defined as p = N/L^2. It has been reported that there is an optimal region of
p for cooperation when the neighbors are chosen according to a prescribed
distance [25]. Fig. 5(a) shows the combined effect of v and p on the
cooperator frequency f_c for k = 9 and b = 1.07. For a fixed v, one can see
that f_c decreases monotonically as p increases, and the rate of decrease
grows with v. Clearly, our finding differs from that reported in Ref. [25],
and this difference is rooted in the definition of neighborhoods. Here p
indicates how densely the players are distributed on the plane at t = 0. In
Ref. [25], p determines the average degree <k> of the interaction network,
and <k> increases with p. Previous work has revealed that moderate values of
the average degree can enhance cooperation [32], which makes the existence of
an optimal region of p for cooperation understandable. In our model,
however, each agent plays with a constant number of neighbors. Increasing p
produces a dense population, which brings fast changes in the neighborhoods
of the players. And for a fixed p, a sufficiently large v would hamper the
evolution of cooperation, as discussed above. Hence the system shows low
values of f_c for large p and v. Fig. 5(b) sheds more light on the role of k
when p increases. When k < 9, the increase of p leads to an apparent
decrease of f_c, while for k > 21, variation of the density causes only small
fluctuations of f_c.

Fig. 5. (a) The cooperator frequency f_c versus the absolute velocity v and
the density p for k = 9 and b = 1.07. (b) The cooperator frequency f_c versus
the density p for v = 0.05 and b = 1.05. v ranges from 0.025 to 0.35 in steps
of 0.025, and p ranges from 5 to 75 in steps of 5.

To gain insight into the evolution of cooperation among mobile players,
Fig. 6 provides snapshots of spatial configurations at equilibrium, which are
obtained in one realization. To eliminate additional mechanisms that favor
cooperation, the value of the temptation is near the extinction threshold of
cooperators. One can see that the system gradually splits into many small
flocks, in each of which all the agents move in the same direction. Because
the agents are located in a plane without boundary restrictions, they fly
apart and never meet again. During the process of direction alignment,
cooperators can survive by forming compact clusters. The two strategies may
coexist at equilibrium, as shown in the last three panels. One can see that
defectors are located on the border of flocks or are surrounded by
cooperators. For cooperators adjacent to defectors, mutual cooperation makes
their cooperative neighbors earn higher payoffs than the defectors. According
to the best-takes-over rule of strategy update, these cooperators will follow
the strategies of their cooperative neighbors. That is why cooperation can be
maintained in the population, and this
mechanism has been found in the lattice structure [SJ. 

4 Conclusion 

To summarize, we have investigated the effects of mobility on the evolution
of cooperation in the direction alignment process of flocks. Numerical
simulations show that cooperation can be maintained among mobile players with
simple strategies. Depending on the temptation to defect and the velocity at
which the agents move, there exists an optimal size of the interaction
neighborhood that produces the maximum cooperation level. Compared with the
case in which no agent moves, the cooperation level can even be enhanced by
the mobility of individuals if the velocity and the size of the neighborhood
are small. The cooperation level is also affected by the density p of agents:
f_c decreases as p increases. Moreover, the system exhibits aggregation
behavior, and we illustrate the coexistence of different strategies at
equilibrium. Our work may be relevant for understanding the role of
information flows in cooperative, multi-vehicle systems.

This work is supported by the Key Fundamental Research Program of Shanghai
(Grant No. 09JC1408000), the National Key Fundamental Research Program (Grant
No. 2002cb312200) and the National Natural Science Foundation of China (Grant
No. 60575036).

[Panels: (a) t = 0; (b) t = 25; (c) t = 100; (d) t = 1300 (equilibrium); (e), (f), (g)]

Fig. 6. Snapshots of the evolution of cooperation with b = 1.35, k = 9 and 
v = 0.05. Cooperators (red circles) form clusters to resist the invasion of defec- 
tors (white circles). At an equilibrium state, players running toward the same 
direction stay together, and their velocities are denoted by arrows. The last 
three figures present details of the labeled components in (d). To give a clear 
figure of spatial configuration, not all directions of the agents are denoted in 
(e), (f) and (g). 